563 research outputs found

    iREAD: a tool for intron retention detection from RNA-seq data.

    BACKGROUND: Intron retention (IR) has been traditionally overlooked as 'noise' and received negligible attention in the field of gene expression analysis. In recent years, IR has become an emerging field for interrogating transcriptomes because it has been recognized to carry out important biological functions such as gene expression regulation and it has been found to be associated with complex diseases such as cancers. However, methods for detecting IR today are limited. Thus, there is a need to develop novel methods to improve IR detection. RESULTS: Here we present iREAD (intron REtention Analysis and Detector), a tool to detect IR events genome-wide from high-throughput RNA-seq data. The command line interface for iREAD is implemented in Python. iREAD takes as input a BAM file, representing the transcriptome, and a text file containing the intron coordinates of a genome. It then 1) counts all reads that overlap intron regions, 2) detects IR events by analyzing the features of reads such as depth and distribution patterns, and 3) outputs a list of retained introns into a tab-delimited text file. iREAD provides significant added value in detecting IR compared with output from IRFinder with a higher AUC on all datasets tested. Both methods showed low false positive rates and high false negative rates in different regimes, indicating that use together is generally beneficial. The output from iREAD can be directly used for further exploratory analysis such as differential intron expression and functional enrichment. The software is freely available at https://github.com/genemine/iread. CONCLUSION: Being complementary to existing tools, iREAD provides a new and generic tool to interrogate poly-A enriched transcriptomic data of intron regions. Intron retention analysis provides a complementary approach for understanding transcriptome
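Step 1 of the abstract above (counting reads that overlap intron regions) can be illustrated with a minimal sketch. This is not iREAD's code: the `Intron` and `Read` tuples, the `count_intron_overlaps` function, and the toy coordinates are all hypothetical, and half-open `[start, end)` intervals are an assumption made for the example. A real pipeline would read alignments from a BAM file and use an indexed interval structure rather than a nested loop.

```python
from collections import namedtuple

# Hypothetical record types; coordinates are assumed half-open: [start, end).
Intron = namedtuple("Intron", ["chrom", "start", "end"])
Read = namedtuple("Read", ["chrom", "start", "end"])


def count_intron_overlaps(introns, reads):
    """Count, for each intron, the reads that overlap it by at least one base."""
    counts = {intron: 0 for intron in introns}
    for read in reads:
        for intron in introns:
            same_chrom = read.chrom == intron.chrom
            # Two half-open intervals overlap when each starts before the other ends.
            overlaps = read.start < intron.end and intron.start < read.end
            if same_chrom and overlaps:
                counts[intron] += 1
    return counts


# Toy data (invented for illustration only).
introns = [Intron("chr1", 100, 200), Intron("chr1", 500, 600)]
reads = [
    Read("chr1", 90, 140),   # overlaps the first intron
    Read("chr1", 150, 190),  # lies inside the first intron
    Read("chr1", 200, 250),  # starts exactly where the first intron ends: no overlap
    Read("chr2", 120, 160),  # different chromosome
]
counts = count_intron_overlaps(introns, reads)
for intron, n in counts.items():
    print(intron.chrom, intron.start, intron.end, n, sep="\t")
```

The nested loop is quadratic and only suitable for toy inputs; it is shown here because it makes the overlap condition explicit.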

    A Cell-Surface Membrane Protein Signature for Glioblastoma.

    We present a systems strategy that facilitated the development of a molecular signature for glioblastoma (GBM), composed of 33 cell-surface transmembrane proteins. This molecular signature, GBMSig, was developed through the integration of cell-surface proteomics and transcriptomics from patient tumors in the REMBRANDT (n = 228) and TCGA datasets (n = 547) and can separate GBM patients from control individuals with a Matthews correlation coefficient value of 0.87 in a lock-down test. Functionally, 17/33 GBMSig proteins are associated with transforming growth factor β signaling pathways, including CD47, SLC16A1, HMOX1, and MRC2. Knockdown of these genes impaired GBM invasion, reflecting their role in disease-perturbed changes in GBM. ELISA assays for a subset of GBMSig (CD44, VCAM1, HMOX1, and BIGH3) on 84 plasma specimens from multiple clinical sites revealed a high degree of separation of GBM patients from healthy control individuals (area under the receiver operating characteristic curve of 0.98). In addition, a classifier based on these four proteins differentiated the blood of pre- and post-tumor resections, demonstrating potential clinical value as biomarkers
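For readers unfamiliar with the Matthews correlation coefficient reported above, the following sketch shows how it is computed from a binary confusion matrix. The function name `mcc` and the counts in the usage lines are hypothetical and are not taken from the GBMSig study; returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, invented for illustration only.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
mostly_right = mcc(tp=45, tn=45, fp=5, fn=5)
print(perfect, mostly_right)
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which makes it informative when patient and control groups differ in size.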

    Reproducible big data science: A case study in continuous FAIRness.

    Big biomedical data create exciting opportunities for discovery, but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperable, and reusable (FAIR). In response, we describe tools that make it easy to capture, and assign identifiers to, data and code throughout the data lifecycle. We illustrate the use of these tools via a case study involving a multi-step analysis that creates an atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data. We show how the tools automate routine but complex tasks, capture analysis algorithms in understandable and reusable forms, and harness fast networks and powerful cloud computers to process data rapidly, all without sacrificing usability or reproducibility, thus ensuring that big data are not hard-to-(re)use data. We evaluate our approach via a user study, and show that 91% of participants were able to replicate a complex analysis involving considerable data volumes

    Metabolic Network Analysis Reveals Altered Bile Acid Synthesis and Metabolism in Alzheimer's Disease.

    Increasing evidence suggests Alzheimer's disease (AD) pathophysiology is influenced by primary and secondary bile acids, the end product of cholesterol metabolism. We analyze 2,114 post-mortem brain transcriptomes and identify genes in the alternative bile acid synthesis pathway to be expressed in the brain. A targeted metabolomic analysis of primary and secondary bile acids measured from post-mortem brain samples of 111 individuals supports these results. Our metabolic network analysis suggests that taurine transport, bile acid synthesis, and cholesterol metabolism differ in AD and cognitively normal individuals. We also identify putative transcription factors regulating metabolic genes and influencing altered metabolism in AD. Intriguingly, some bile acids measured in brain tissue cannot be explained by the presence of enzymes responsible for their synthesis, suggesting that they may originate from the gut microbiome and are transported to the brain. These findings motivate further research into bile acid metabolism in AD to elucidate their possible connection to cognitive decline

    Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types.

    Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits

    Partial inhibition of mitochondrial complex I ameliorates Alzheimer's disease pathology and cognition in APP/PS1 female mice.

    Alzheimer's Disease (AD) is a devastating neurodegenerative disorder without a cure. Here we show that mitochondrial respiratory chain complex I is an important small molecule druggable target in AD. Partial inhibition of complex I triggers the AMP-activated protein kinase-dependent signaling network leading to neuroprotection in symptomatic APP/PS1 female mice, a translational model of AD. Treatment of symptomatic APP/PS1 mice with complex I inhibitor improved energy homeostasis, synaptic activity, long-term potentiation, dendritic spine maturation, cognitive function and proteostasis, and reduced oxidative stress and inflammation in brain and periphery, ultimately blocking the ongoing neurodegeneration. Therapeutic efficacy in vivo was monitored using translational biomarkers FDG-PET, 31P NMR, and metabolomics. Cross-validation of the mouse and the human transcriptomic data from the NIH Accelerating Medicines Partnership-AD database demonstrated that pathways improved by the treatment in APP/PS1 mice, including the immune system response and neurotransmission, represent mechanisms essential for therapeutic efficacy in AD patients